Modeling skewed distributions using multifractals and the '80-20 law'
Abstract
The focus of this paper is on the characterization of the skewness of an attribute-value distribution and on the extrapolations for interesting parameters. More specifically, given a vector with the highest h multiplicities m = (m_1, m_2, ..., m_h) and some frequency moments F_q = Σ_i m_i^q (e.g., q = 0, 2), we provide effective schemes for obtaining estimates about either its statistics or subsets/supersets of the relation. We assume an 80/20 law, and specifically, a p/(1-p) law. This law gives a distribution which is commonly known in the fractals literature as "multifractal". We show how to estimate p from the given information (the first few multiplicities and a few moments), and present the results of our experimentation on real data. Our results demonstrate that schemes based on our multifractal assumption consistently outperform those schemes based on the uniformity assumption, which are commonly used in current DBMSs. Moreover, our schemes can be used to provide estimates for supersets of a relation, which the uniformity-assumption-based schemes cannot provide at all.
This work was partially supported by the National Science Foundation under CDR, EEC, and IRI grants, with matching funds from Empress Software Inc. and Thinking Machines Inc. Part of the work was performed while visiting AT&T Bell Laboratories. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment. Proceedings of the VLDB Conference, Mumbai (Bombay), India.
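As a rough illustration of the p/(1-p) law described in the abstract, the Python sketch below builds a binomial multifractal by splitting a total count recursively in proportions p and 1-p, computes the frequency moments F_q, and recovers p by matching F_0, F_1, and F_2. The function names, the number of levels, and the moment-matching estimator are assumptions made for this example; the abstract does not specify the paper's own estimation procedure, so this should not be read as the authors' method.

```python
import math

def multifractal_counts(total, p, levels):
    """Split `total` recursively in proportions p and 1-p, `levels` times.

    Returns 2**levels multiplicities (hypothetical helper for illustration).
    """
    counts = [float(total)]
    for _ in range(levels):
        counts = [c * f for c in counts for f in (p, 1.0 - p)]
    return counts

def frequency_moment(counts, q):
    """F_q = sum of m_i**q over the non-zero multiplicities m_i."""
    return sum(c ** q for c in counts if c > 0)

def estimate_bias(f0, f1, f2):
    """Moment-matching estimate of p (an assumption, not the paper's estimator).

    For a binomial multifractal with k levels:
      F_0 = 2**k, F_1 = N, F_2 = N**2 * (p**2 + (1-p)**2)**k.
    """
    levels = math.log2(f0)
    s = (f2 / f1 ** 2) ** (1.0 / levels)  # s = p^2 + (1-p)^2
    # Solve 2p^2 - 2p + 1 = s and take the root with p >= 0.5.
    return 0.5 * (1.0 + math.sqrt(max(0.0, 2.0 * s - 1.0)))

counts = multifractal_counts(total=1_000_000, p=0.8, levels=10)
f0, f1, f2 = (frequency_moment(counts, q) for q in (0, 1, 2))
p_hat = estimate_bias(f0, f1, f2)
```

On synthetic data generated with p = 0.8, this estimate returns a value very close to 0.8; on real data the fit would only be approximate, since real distributions need not follow the cascade exactly.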
Related articles
Using Weighted Distributions for Modeling Skewed, Multimodal and Truncated Data
When the observations reflect a multimodal, asymmetric, or truncated structure, or a combination of these, using the usual unimodal and symmetric distributions leads to misleading results. Therefore, distributions capable of modeling skewness, multimodality, and truncation have always been of core interest in the statistical literature. There are different methods to construct ...
Modeling Fractal Structure of City-Size Distributions Using Correlation Functions
Zipf's law is one of the most conspicuous empirical facts about cities; however, there is no convincing explanation for the scaling relation between rank and size or for its scaling exponent. Using ideas from general fractals and scaling, I propose a dual competition hypothesis of city development to explain the value intervals and the special value, 1, of the power exponent. Zipf's law and Pareto's...
Modeling skewed distributions using multifractals and the '80-20 law'
The focus of this paper is on the characterization of the skewness of an attribute-value distribution and on the extrapolations for interesting parameters. More specifically, given a vector with the highest h multiplicities m = (m_1, m_2, ..., m_h), and some frequency moments F_q = Σ_i m_i^q (e.g., q = 0, 2), we provide effective schemes for obtaining estimates about either its statistics or subsets...
Modeling skewed distributions using multifractals and the '80-20 law'
The focus of this paper is on the characterization of the skewness of an attribute-value distribution and on the extrapolations for interesting parameters. More specifically, given a vector with the highest h multiplicities m = (m_1, m_2, ..., m_h), and some frequency moments F_q = Σ_i m_i^q (e.g., q = 0, 2), we provide effective schemes for obtaining estimates about either its statistics or subsets/supersets of the relation. We assume an 80/20 law, and specifically, a p/(...